Uniform-in-phase-space data selection with iterative normalizing flows

Abstract

Improvements in computational and experimental capabilities are rapidly increasing the amount of scientific data that is routinely generated. In applications constrained by memory intensity, excessively large datasets may hinder discovery, making data reduction a critical component of data-driven methods. Datasets are growing in two directions: the number of data points and their dimensionality. Whereas dimension reduction typically aims at describing each data sample on a lower-dimensional space, the focus here is on reducing the number of data points. A strategy is proposed to select data points such that they uniformly span the phase-space of the data. The algorithm relies on estimating the probability map of the data and using it to construct an acceptance probability. An iterative method is used to accurately estimate the probability of rare data points when only a small subset of the dataset is used to construct the probability map. Instead of binning the phase-space to estimate the probability map, its functional form is approximated with a normalizing flow. Therefore, the method naturally extends to high-dimensional datasets. The proposed framework is demonstrated as a viable pathway to enable data-efficient machine learning when abundant data is available.
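The selection strategy described in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: a Gaussian kernel density estimate stands in for the normalizing flow, and the function names, the bandwidth, and the subset size `n_fit` are all hypothetical. The iterative step is also an assumption: the density of the previous selection is multiplied into the running estimate, on the reasoning that a truly uniform selection would have a flat density.

```python
import math
import random


def gaussian_kde(train, bandwidth):
    """Return a 1-D Gaussian kernel density estimate fitted to `train`."""
    norm = 1.0 / (len(train) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        s = sum(math.exp(-0.5 * ((x - t) / bandwidth) ** 2) for t in train)
        return max(norm * s, 1e-12)  # floor avoids division by zero later

    return density


def acceptance_probabilities(p, n_target):
    """Return a_i = min(1, c / p_i), with c bisected so that sum(a_i) ~ n_target."""
    lo, hi = 0.0, max(p)  # at c = max(p) every point is accepted
    for _ in range(60):
        c = 0.5 * (lo + hi)
        if sum(min(1.0, c / pi) for pi in p) < n_target:
            lo = c
        else:
            hi = c
    return [min(1.0, hi / pi) for pi in p]


def select_uniform(data, n_target, n_fit=200, n_iter=2, bandwidth=0.3, seed=0):
    """Return indices of roughly n_target points spread evenly over the data range."""
    rng = random.Random(seed)
    train = rng.sample(data, min(n_fit, len(data)))  # small random subset
    p = [1.0] * len(data)  # running density estimate at each data point
    selected = []
    for _ in range(n_iter):
        density = gaussian_kde(train, bandwidth)
        # First pass: density of the data. Later passes: density of the previous
        # selection, which is flat only if the current estimate was accurate
        # (assumed multiplicative correction).
        p = [pi * density(x) for pi, x in zip(p, data)]
        accept = acceptance_probabilities(p, n_target)
        selected = [i for i, a in enumerate(accept) if rng.random() < a]
        chosen = [data[i] for i in selected]
        train = rng.sample(chosen, min(n_fit, len(chosen)))
    return selected
```

Calling `select_uniform(data, n_target=100)` on a few thousand Gaussian samples should return a subset in which tail points are strongly over-represented relative to the raw data, since points in low-density regions receive acceptance probabilities near one. Replacing `gaussian_kde` with a density model that scales to many dimensions is the role the abstract assigns to the normalizing flow.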

Similar resources

Clustering and Classification through Normalizing Flows in Feature Space

A unified variational methodology is developed for classification and clustering problems, and tested in the classification of tumors from gene expression data. It is based on fluid-like flows in feature space that cluster a set of observations by transforming them into likely samples from p isotropic Gaussians, where p is the number of classes sought. The methodology blurs the distinction betw...

Variational Inference with Normalizing Flows

The choice of approximate posterior distribution is one of the core problems in variational inference. Most applications of variational inference employ simple families of posterior approximations in order to allow for efficient inference, focusing on mean-field or other simple structured approximations. This restriction has a significant impact on the quality of inferences made using variation...

Convolutional Normalizing Flows

Bayesian posterior inference is prevalent in various machine learning problems. Variational inference provides one way to approximate the posterior distribution; however, its expressive power is limited, and so is the accuracy of the resulting approximation. Recently, there has been a trend of using neural networks to approximate the variational posterior distribution due to the flexibility of neural netw...

Spinning Fast Iterative Data Flows

Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk iterative algorithms are supported by novel dataflow frameworks, these systems cannot exploit computational dependencies present in many algorithms, such as grap...

Normalizing Flows on Riemannian Manifolds

We consider the problem of density estimation on Riemannian manifolds. Density estimation on manifolds has many applications in fluid-mechanics, optics and plasma physics and it appears often when dealing with angular variables (such as used in protein folding, robot limbs, gene-expression) and in general directional statistics. In spite of the multitude of algorithms available for density esti...

Journal

Journal title: Data-Centric Engineering

Year: 2023

ISSN: 2632-6736

DOI: https://doi.org/10.1017/dce.2023.4